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CROSS-REFERENCE TO RELATED APPLICATIONS 

The present application is related to the following co-pending applications: Serial 
5 No.XXXX (Attorney Docket D/A1270) entitled "Digital De-Screening of Documents", 
Serial No.XXXX (Attorney Docket D/A1270Q) entitled "Control System for Digital De- 
Screening of Documents", Serial No.XXXX (Attorney Docket D/A1271) entitled 
"Segmentation Method and System for Multiple Raster Content (MRC) Representation 
of Documents", Serial No.XXXX (Attorney Docket D/A1271Q) entitled "Dynamic 
10 Threshold System for Multiple Raster Content (MRC) Representation of Documents", all 
filed July 01, 2002, on the same date as the present application and commonly 
assigned to the present assignee, the contents of which are herein incorporated by 
reference. 

BACKGROUND OF THE INVENTION 
1 5 FIELD OF THE INVENTION 

The present invention relates generally to methods and systems for segmenting 
digitally scanned documents into two or more planes, and more particularly to methods 
and systems for segmenting digitally scanned documents into planes suitable for a 
Multiple Raster Content (MRC) representation of documents. 

20 DESCRIPTION OF RELATED ART 

The MRC representation of documents is versatile. It provides the ability to 
represent color images and either color or monochrome text. The MRC representation 
enables the use of multiple "planes" for the purpose of representing the content of 
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documents. The MRC representation is becoming increasingly important in the 
marketplace. It has been already established as the main color-fax standard. 

In an MRC representation, an image is represented by more than one image 
plane. The main advantage of the MRC representation of documents is to provide an 
5 efficient way to store, transmit, and manipulate large digital color documents. The 
method exploits the properties of the human vision system, where the ability to 
distinguish small color variations is greatly reduced in the presence of high-contrast 
edges. The edge information is normally separated from the smoothly varying color 
information, and encoded (possibly at higher resolution than 1 bit per pixel) in one of the 
10 planes, called the Selector plane. Following a careful separation, the various planes 
could be independently compressed using standard compression schemes (such as 
JPEG and G4) with good compression and high quality at the same time. 

There is a need for a method and a system for efficiently separating an image 
into a set of planes, such that the advantages of the MRC representation can be fully 
15 exploited. 

SUMMARY OF THE INVENTION 

A method and a system for separating an image signal into a set of image planes in 
accordance with a control signal are disclosed. The system comprises a selector 
module, an edge processing module and a foreground/background separation module. 

20 The selector module receives the control signal and produces a selector signal. The 
edge processing module receives the selector signal and produces a decision signal. 
The foreground/background separation module receives the image signal and the 
decision signal, and outputs a foreground signal and a background signal, a 
representation of the current pixel of the image signal being included in at least one of 

25 the foreground signal and the background signal in accordance with the decision signal. 



2 



BRIEF DESCRIPTION OF THE DRAWINGS 

The features and advantages of the present invention will become apparent from 
the following detailed description of the present invention in which: 

FIG. 1 illustrates the MRC structure for documents. 

5 FIG. 2 shows the block diagram of the system of the present invention. 

FIG. 3 shows the block diagram of an embodiment of the system of the present 
invention. 

FIG. 4 illustrates the function of the Dependent Min-Max block E1 used in one 
embodiment of the system of the present invention. 

10 FIG. 5 illustrates the function of the Dependent Min-Max Sub-Sample block E2 used in 
one embodiment of the system of the present invention. 

FIG. 6 illustrates the functions of the Dependent Max block E3 and Dependent Min 
block E4 used in one embodiment of the system of the present invention. 

FIG. 7 illustrates the two window contexts employed by one embodiment of the 
15 Dynamic Threshold module. 

FIG. 8 shows the block diagram of one embodiment of the Dynamic Threshold module. 

FIG. 9 shows an implementation of the comparator logic block included in one 
embodiment of the Dynamic Threshold module. 

FIG. 10 shows the truth table of the comparator logic block of FIG. 9. 

20 FIG. 11 shows an implementation of the selector logic module included in one 
embodiment of the Dynamic Threshold module. 

FIG. 12 illustrates the function of the Edge Processing block included in the Separation 
module. 

FIG. 13 illustrates the decision range used by the Separation module for separating the 
25 image signal into the Background and Foreground planes. 
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FIG. 14 shows a block diagram of one implementation of the FG/BG Cleanup block 
included in one embodiment of the Separation module. 

FIG. 15 illustrates the dilate operation used in one implementation of the FG/BG 
Cleanup block included in one embodiment of the Separation module. 

5 FIG. 16 is a graphical illustration of equations (1) through (4). 

FIG. 17 is a graphical illustration of equations (6) through (9). 

FIG. 18 shows an exemplary structure of the halftone estimate module. 

FIG. 19 shows a min-max detection scheme used by the min-max detection modules 
included in the halftone estimate module of FIG. 18. 

10 FIG. 20 illustrates the equations that implement the halftone weight module included in 
the halftone estimate module. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides a method and a system for separating an image 
signal into a set of image planes. The image signal represents a digitally scanned 
15 document. The image planes are suitable for a Mixed Raster Content (MRC) 
representation of the digitally scanned document. 

FIG. 1 shows the general MRC representation. The representation comprises up 
to four independent planes: Foreground, Background, Selector, and Rendering Hints. 
In the most general case, there could be multiple Foreground and Selector pairs at 

20 higher levels. However, in most applications, the representation is limited to three or 
four planes. The Background plane is typically used for storing continuous-tone 
information such as pictures and/or smoothly varying background colors. The Selector 
plane normally holds the image of text (binary) as well as other edge information (e.g., 
line art drawings). The Foreground plane usually holds the color of the corresponding 

25 text and/or line art. However, the MRC representation only specifies the planes and 
their associated compression methods. It does not otherwise restrict nor enforce the 
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content of each of the planes. The content of each of the planes may be defined 
appropriately by an implementation of the MRC representation. 

The MRC structure also allows for a fourth plane, the Rendering Hints plane, 
which is used for communicating additional information about the content of the 
5 document. For example, the Rendering Hints plane may carry the ICC (International 
Color Consortium) color hints that identify the best color matching strategy for the 
various objects on the page. 

The Foreground and Background planes are defined to be two full-color (L, a, b) 
planes. The Selector plane is defined as a binary (1-bit deep) plane. The Rendering 

10 Hints plane is typically restricted to an 8-bit plane. One exemplary MRC representation 
specifies that the Foreground and Background are to be JPEG compressed, and that 
the Selector plane is to be ITU-G4 compressed (standard Group 4 facsimile 
compression). The Rendering Hints plane is considered to be optional, but if one is 
used, a compression scheme similar to the Lempel-Zev-Welch scheme may be used for 

15 its compression. In general, the Foreground, Background, Selector and Rendering 
Hints planes can all be at different resolutions, and they are not required to maintain th 
original source input resolution. 

The method for assembling back a "segmented" MRC image from its 
components (i.e., planes) is by "pouring" the Foreground colors through the Selector 

20 plane "mask" on top of the Background plane, thus overwriting the previous content of 
the Background plane at these locations. In other words, the assembly is achieved by 
multiplexing between the Foreground and Background information on a pixel by pixel 
basis, based on the binary control signal of the Selector plane. For example, if the 
Selector value is 1, the content of Foreground is used; otherwise (i.e., for Selector value 

25 = 0) the content of Background is used. The multiplexing operation is repeated on a 
pixel by pixel basis until all of the output pixels have been defined. 

The main advantage of the MRC representation of documents is to provide an 
efficient way to store, transmit, and manipulate large digital color documents. The 



5 



method exploits the properties of the human vision system, where the ability to 
distinguish small color variations is greatly reduced in the presence of high-contrast 
edges. The edge information is normally separated from the smoothly varying color 
information, and encoded (possibly at higher resolution than 1 Selector sample per 
5 source pixel) in the Selector plane. Following a careful separation, the various planes 
could be independently compressed using standard compression schemes (such as 
JPEG and G4) with good compression and high quality at the same time. 

The Segmentation system of the present invention is used for splitting an 
incoming image into three or more planes suitable for an MRC representation of the 
10 image. 

FIG. 2 shows a block diagram of the Segmentation system of the present 
invention. Segmentation system 200 comprises a Min-Max module 202, a Dynamic 
Threshold module 204 and a Separation module 206. The Min-Max module 202 
receives the image signal DSC, searches for minima and maxima within a set of 

15 windows centered on a pixel in the image signal. The Dynamic Threshold module 204 
computes, for each of the windows, based on the minima and maxima received from the 
min-max module and the current pixel, a respective indicator representing the distance 
and direction of the current pixel relative to a respective threshold plane, and outputs a 
control signal based on the indicators. The Separation module 206 separates the 

20 image signal into the set of image planes in accordance with the control signal by 
including a representation of the current pixel in at least one of the image planes. 

FIG. 3 shows a block diagram of one embodiment 300 of the Segmentation 
system 200. 

For best performance of the Segmentation system 300, the input signal DSC 
25 should be free of most of the original mid-frequency halftone patterns of the original 
scanned image. These halftone frequencies are typically eliminated by passing the 
input image through a de-screen system first. However, in some situations, such as for 
clean PDL (Page Description Language) printing, the input signal may be known to be 
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free of problematic halftone frequencies. In such situations, the de-screen operation is 
not needed and the clean input signal can be directly fed into the Segmentation system. 

For ease of explanation, in the description of the Segmentation system 300 
herein, the source input image DSC, as well as the Foreground FG and Background BG 
5 outputs, are all assumed to be full-color (L, a, b) planes, while the Selector plane SEL 
output is binary (1-bit). It is understood that these assumptions are not to be construed 
as limitations of the applications of the present invention. 

In general, the Foreground, Background, and Selector planes could all be at 
different resolutions relative to the input image DSC. For example, the Foreground and 
10 Background planes are typically down-sampled (for better compression) while the 
Selector plane is typically up-sampled (for better edge quality) from the original input 
resolution. The amount of up or down sampling may be fully programmable under 
software control. 

The Segmentation system 300 may also receive and use the optional estimated 
15 frequency Halftone Weight HTW and full color Super Blur BLR_A signals when they are 
available. These optional signals may be generated by a de-screen or filtering system 
such as the one described in a co-pending patent application. The optional full color 
Super Blur BLR_A signal may be generated by lowpass filtering the image source 
signal with a filter that has a very large filter span (i.e., very low cut-off frequency). The 
20 optional estimated frequency Halftone Weight HTW will be described in detail later in 
connection with FIG. 18, FIG. 19 and FIG. 20. 

The Segmentation system 300 comprises a Min-Max module 310, a Dynamic 
Threshold module 320, and a Separation module 330. 

The Min-Max module 310 comprises a Dependent Min-Max block E1, a 
25 Dependent Min-Max Sub-Sample block E2 and two Dependent Min-Max blocks E3, E4. 
The Min-Max module 310 receives the input image signal DSC (3-dimensional), 
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computes and outputs two sets of maximum and minimum vectors (Mx, Mn), (MX, MN), 
each set corresponding to a different window. 

The Dynamic Threshold module 320 receives the input image signal DSC, and 
the vectors (Mx, Mn), (MX, MN) from the Min-Max module 310 and computes, for each 
5 of the windows, based on the respective minimum and maximum received from the min- 
max module and the current pixel, a respective indicator representing the distance and 
direction of the current pixel relative to a respective threshold plane, and outputs based 
on the indicators a control signal GRS to the Separation module 330. The optional 
control signals SEG, ENH may also be outputted. The Dynamic Threshold module 320 
10 also receives the optional estimated frequency Halftone Weight HTW and full color 
Super Blur BLR_A signals when they are available. 

The Separation module 330 comprises a Selector Logic block E6, an Edge 
Processing block E7, a FG/BG Separation block E8, and a FG/BG Cleanup block E9. 
The Separation module 330 receives the image signal DSC, the vectors Mx, Mn from 
15 the Min-Max module 310, the control signal GRS and the optional control signals SEG, 
ENH from the Dynamic Threshold module 320, and outputs the three signals BG, FG, 
SEL, which correspond to the Background, Foreground, and Selector planes of an MRC 
representation of the image DSC, respectively. 

The Dependent Min-Max block E1 receives the input image signal DSC, and 
20 searches in a 5x5 window centered on the current pixel of interest for the minimum 
value (vector) Mn and maximum value (vector) Mx. The vectors Mn and Mx represent 
the minimum and maximum in the window context of 5x5 pixels. The meaning of these 
rectors will be described in detail later. 

The Dependent Min-Max Sub-Sample block E2 receives the input image signal 
25 DSC, and searches for the minimum and maximum luminance value in each of the non- 
overlapping 8x8 windows, and also provides the corresponding chroma values at these 
locations. By using non-overlapping 8x8 windows, the Dependent Min-Max Sub- 
Sample block E2 effectively sub-samples the minimum and maximum values by a factor 
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of 8 in each direction, thus reducing the overall bandwidth by a factor of 64. The sub- 
sampled outputs are then fed to the two Dependent Min-Max blocks E3 and E4, which 
search for the minimum and maximum vectors MN and MX over a 9x9 window centered 
on the original (before sub-sampling) 8x8 window that contains the current pixel of 
interest. Thus, the MN and MX vectors correspond to the minimum of all the minima 
and the maximum of all the maxima from the non-overlapping 8x8 windows, 
respectively. Due to the sub-sampling (by 8) effect, the 9x9 window actually 
corresponds to a window context of 72x72 pixels. It is noted that capital letters are used 
for vectors MN and MX to distinguish them from the vectors Mn and Mx (outputs of 
block E1) and to indicate that they represent the minimum and maximum in the larger 
window context of 72x72 pixels overall. 

The two sets of minimum and maximum vectors (Mn, Mx) and (MN, MX) are fed 
to the Dynamic Threshold Module 320. The Dynamic Threshold Module 320 outputs 
the monochrome 8-bit signal GRS whose biased zero crossings represent the locations 
of edges in the Selector planes. In addition, The Dynamic Threshold Module may also 
generate the optional binary control signal SEG and the optional 8-bit segmentation 
enhancement control ENH. The optional binary control signal SEG provides an external 
means (similar to that of an override switch) to control the segmentation operation of the 
FG/BG Separation block E8 of Separation module 330 (see equations (14) through 
(20)). The optional 8-bit segmentation enhancement control ENH provides to the 
FG/BG Separation block E8 the amount of enhancement to apply. 

The Selector Logic block E6 receives the 8-bit Gray Selector signal GRS from 
the Dynamic Threshold Module 320, up-samples it by doubling the resolution, and then 
thresholds it at the zero crossings to produce the binary Selector plane output SEL. For 
high-quality text and line-art reproduction, the Selector plane is typically kept at twice 
the input resolution (1200 dpi for a 600 dpi input), although it could be programmed for 
even higher ratio (in one implementation, up to 8 times the input resolution) under 
software control. 
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But in applications that do not require very high quality, the Selector plane could 
be at the same resolution as the input signal DSC. The Edge Processing block E7 
receives the high resolution Selector output SEL and counts the number of ON and 
OFF pixels in a 5x5 (high-resolution) window centered on the current (low-resolution) 
5 pixel of interest. The Edge Processing block E7 outputs the two-bit signal SEE. The 
SEE signal is set to 0 if all of the input pixels inside the 5x5 window are OFF 
(corresponding to a 5x5 constant Background area). Similarly, the SEE signal is set to 
3 if all of the input pixels inside the window are ON (corresponding to a 3x3 constant 
Foreground area). The SEE output is set to 1 or 2 if the 3x3 window is mostly 
10 Background (white) or mostly Foreground (black), respectively. 

The FG/BG Separation block E8 receives the full color source signal DSC to be 
segmented, the full color minimum and maximum vectors Mn, Mx from the Dependent 
Min-Max block E1 , the SEE signal from the Edge Processing block E7, the optional 
segmentation signal SEG, and the enhancement control signal ENH from the Dynamic 
15 Threshold Module 320. The FG/BG Separation block E8 performs the MRC 
segmentation to generate the Foreground and Background information, and produces 
two full-color outputs Fgr and Bgr as the rough estimates of the Foreground and 
Background planes, respectively. 

The FG/BG Cleanup block E9 applies additional processing on the rough 
20 Foreground and Background estimates Fgr and Bgr to generate the final Foreground 
and Background outputs FG and BG. This additional processing is to slightly extend 
the Foreground and Background values beyond the edges and to fill in the undefined 
pixels in the Foreground and Background planes with appropriate values. The purpose 
of this processing is to prevent artifacts that may result from a subsequent sampling and 
25 JPEG compression and to fill in the yet-undefined pixels with values that will result in 
good JPEG compression ratio. 

An additional logic inside the FG/BG Cleanup block E9 (see Tile Tag block F7 of 
FIG. 14) also monitors the Foreground and Background output values to detect and flag 
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tiles that are almost ail-black or all-white. Rather than encode the output from such tiles 
into the output file, a special tile marker is used and referenced whenever such a tile is 
detected. This increases the overall compression ratio by eliminating the need to 
repeatedly encode the common all-white or all-black tiles. 

5 The blocks included in the Min-Max module 310 will be discussed in detail in the 

following. 

The Dependent Min-Max block E1 looks for the maximum and minimum values 
of the luminance component L in a 5x5 window centered on the current pixel of interest, 
and outputs the full-color (luminance and chrominance) values at these locations. It is 
10 called a Dependent Min-Max to indicate that it only searches for the minimum and 
maximum over a single component, which is the luminance L, and not over all three 
components of the image signal DSC. Once the locations of the minimum and 
maximum luminance are found, the chroma components (a, b) at these locations are 
also outputted. 

15 The Dependent Min-Max block E1 outputs two vectors of full-color (L, a, b) 

signals Mn = (L Mn , a Mn , b Mn ), and Mx - (Uu, a Mx , b Mx ), corresponding to the minimum 
and maximum values in the 5x5 window, respectively. The outputs Mn and Mx are at 
the same pixel rate as the input signal DSC. 

FIG. 4 illustrates the operation of the Dependent Min-Max block E1. The content 
20 of the DSC luminance data is first searched in a 5x5 luminance window centered on the 
current pixel of interest to find the locations of the smallest and largest L values. If the 
minimum or maximum L values are not unique (that is, if there is more than one location 
having the same minimum or maximum value), the location of the one first encountered 
is used. The output of this search process is a unique pair (L Mn , Lmx) of the minimum 
25 and maximum L values as well as their relative location within the 5x5 window. 

The Dependent Min-Max block E1 then uses the relative location information to 
index the corresponding chroma (a, b) components in the two corresponding 5x5 
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chroma windows and retrieve the chroma values at these locations. Thus, the relative 
location of the maximum L value L Mx is used to address the 5x5 chroma windows and 
retrieve the chroma pair (a Mx , b Mx ) at this location. Together, the triplet (L Mx , a Mx , tW 
forms the output Mx from the Dependent Min-Max block E1. Similarly, the relative 
5 location of the minimum L value L Mn is used to address the 5x5 chroma windows and 
retrieve the chroma pair (a Mn , b M n) at this location. The triplet (L M n, a Mn , b Mn ) forms the 
output Mn from the Dependent Min-Max block E1 . 

The implementation of the Dependent Min-Max block E1 can be greatly 
accelerated by taking advantage of the sequential nature of the operation and the type 

10 of operations (min-max) that is being performed. For example, as the operation is 
advanced to the subsequent pixel, the extreme values (i.e., maximum and minimum) 
and corresponding locations for the previous pixel are already known. Since the current 
5x5 window greatly overlaps the previous window, by keeping track of the previous 
window content, the Dependent Min-Max block E1 has to sort out only the newest and 

15 oldest 5x1 columns of L values on either side of the previous window. The center 3x5 
area is common to both previous window and current window, and the new address 
locations of the previous minimum and maximum values in the previous window are at 
an offset of 1 in the fast scan direction relative to their previous locations. The previous 
minimum and maximum are compared to values in the newest column of L values to 

20 yield the new maximum and minimum L values. 

FIG. 5 illustrates the Dependent Min-Max Sub-Sample block E2. Block E2 
receives the full-color (L, a, b) input signal DSC and produces two full-color sub- 
sampled minimum and maximum outputs 502 and 504. Block E2 searches for the 
minimum and maximum luminance values over non-overlapping 8x8 windows. The 
25 locations of the minimum and maximum luminance values are then used to index the 
chroma windows and retrieve the corresponding chroma values at these locations. 

By using non-overlapping 8x8 windows, the operation of the Dependent Min-Max 
Sub-Sample block E2 is effectively sub-sampling the min and max outputs (that would 
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have been produced had a sliding window been used instead of non-overlapping 
windows) by a factor of 8 in each direction, thereby reducing the overall output data rate 
by a factor of 64. 

The minimum output 504 corresponds to the triplet (L M in, a M iN, b M iN) formed by 
5 the minimum luminance value Lmin of the input signal DSC within the 8x8 window 
containing the current pixel of interest, and the corresponding chroma (a, b) values 
(aMiN, b M iN) at this minimum luminance location. Similarly, the maximum output 502 
corresponds to the triplet (L M ax, aMAx, bMAx) formed by the maximum luminance value 
Lmax of the input signal DSC within the 8x8 window containing the current pixel of 
10 interest, and the corresponding chroma (a, b) values (aMAx, bMAx) at this maximum 
luminance location. If the minimum or maximum luminance values are not unique (i.e., 
if there is more than one location with the same maximum or minimum values), the one 
first encountered is used. 

The sub-sampling operation is achieved by advancing the current pixel position 
15 by 8 in the fast scan direction (and also, upon reaching the end of a line, by 8 lines in 
the slow scan direction) to maintain the non-overlapping windows condition. 

The 8-times (abbreviated as 8x) reduction factor (in each dimension) of the 
Dependent Min-Max Sub-Sample E2 block is designed in accordance with the amount 
of sub-sampling desired for the Foreground and Background planes (normally a sub- 
20 sampling factor of 2). For higher output image quality (as is the case with a clean PDL 
input image, for example), it may be desirable to not sub-sample the Foreground and 
Background outputs at all. In such a case, a smaller amount of sub-sampling factor 
(e.g., only 4x) is to be applied instead of the 8x factor above. If a sub-sampling factor of 
4 (for each direction) is to be applied, 4x4 non-overlapping windows are used. 

25 The Dependent Min-Max Sub-Sample block E2 is used in conjunction with the 

two Dependent Min and Max Units E3 and E4 to produce a Min-Max analysis similar to 
that of the Dependent Min-Max block E1, but covering a much larger area context 
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(72x72 pixels as compared to 5x5 pixels) and at a coarser resolution to reduce the 
overall bandwidth. 

FIG. 6 illustrates the functions of the Dependent Max block E3 and Dependent 
Min block E4 as used in one embodiment of the system of the present invention. 

5 The Dependent Max block E3 receives the full-color dependent maximum output 

502 from the Dependent Min-Max Sub-Sample block E2, searches the content of the 
luminance data in the signal 502 in a 9x9 luminance window centered on the current 
pixel of interest to find the location of the maximum L value. If the maximum L value is 
not unique (that is, if there is more than one location having the same maximum value), 
10 the location of the one first encountered is used. The output of this search process is 
the maximum value L M x as well as its relative location within the 9x9 window. 

The Dependent Max block E3 then uses the relative location information of Lmx to 
index the corresponding chroma (a, b) components in the two corresponding 9x9 
chroma windows and retrieve the chroma values at this location. Thus, the relative 
15 location of the maximum L value L M x is used to address the 9x9 chroma windows and 
retrieve the chroma pair (a M x, b M x) at this location (as illustrated in FIG. 6). The triplet 
(Lmx, a M x, b M x) forms the output MX of the Dependent Max block E3. 

The Dependent Min block E4 receives the full-color dependent minimum output 
504 from the Dependent Min-Max Sub-Sample block E2, searches the content of the 
20 luminance data in the signal 504 in a 9x9 luminance window centered on the current 
pixel of interest to find the location of the minimum L value. If the minimum L value is 
not unique (that is, if there is more than one location having the same minimum value), 
the location of the first one encountered is used. The output of this search process is 
the minimum value L M n as well as its relative location within the 9x9 window. 

25 The Dependent Min block E4 then uses the relative location information of L M n to 

index the corresponding chroma (a, b) components in the two corresponding 9x9 
chroma windows and retrieve the chroma values at this location. Thus, the relative 
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location of the minimum L value L M n is used to address the 9x9 chroma windows and 
retrieve the chroma pair (a M N, b M N> at this location (as illustrated in FIG. 6). The triplet 
(Lmn, a M N, b M N> forms the output MN of the Dependent Min block E4. 

By applying the Dependent Min block E4 on the dependent minimum output 504 
5 of the Dependent Min-Max Sub-Sample block E2, the dependent minimum operation is 
effectively extended over a larger area to provide a dependent minimum analysis (the 
MN is minimum of minima received from block E2). Similarly, the Dependent Max block 
effectively provides a dependent maximum analysis over the extended area (the MX is 
maximum of maxima received from block E2). Since both inputs 502 and 504 are 
10 already sub-sampled by a factor of 8 in each direction (as compared to the original pixel 
resolution of input image DSC), the equivalent window area for each of the dependent 
minimum MN and maximum MX is 72x72 pixels at the original pixel resolution. 

The Dynamic Threshold module 320 applies adaptive thresholding to the 
incoming source signal DSC to generate a monochrome 8-bit gray signal GRS output, 

15 whose zero crossings represent the edges in the Selector plane. The Dynamic 
Threshold module 320 utilizes the two sets of min/max values (Mn, Mx) and (MN, MX) 
from the 5x5 fine- and 9x9 coarse resolution windows, and may also receive the 
Halftone Weight estimate HTW and the Super Blur BLR_A signals, when they are 
available. The Dynamic Threshold module 320 produces the gray selector signal GRS, 

20 the binary segmentation signal SEG and the 8-bit signal ENH, which is used to 
communicate the amount of segmentation enhancement to apply in the FG/BG 
Separation block E8. 

FIG. 7 illustrates the three available choices of context area: the Single Pixel 
area which is the area of the current pixel of interest, the 5x5 High-Resolution window 
25 W1, and the 9x9 Coarse Resolution window W2. Recall that the 9x9 window context 
W2 corresponds to a window of 72x72 pixels sub-sampled by 8 in each direction. Each 
square (pixel) in the 9x9 coarse resolution window W2 represents an extremum in a 
window of 8x8 original pixels (i.e., pixels at the original pixel resolution). The Dynamic 
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Threshold module 320 uses these three predefined context areas in the process of 
determining the gray selector signal GRS. 

The Single Pixel (current pixel) area is used when no contrast activity (described 
below) exists in both the 5x5 window W1 and 9x9 window W2, in which case the 
5 luminance of the incoming signal DSC is merely thresholded and the chroma (a, b) 
components are not used. Otherwise, the 5x5 High-Resolution and 9x9 Coarse 
Resolution areas are used in combination to track and segment the incoming signal 
DSC based on the level of activity in the windows. Activity in the 5x5 window indicates 
the presence of an image edge in that window. Activity in the 9x9 window indicates that 

10 an edge is either approaching the small window or leaving the small window. Thus, the 
large 9x9 window serves as a look ahead feature. It also provides the history of where 
an edge has been. This allows proper setting of the SEE signal (to be described later). 
The large 9x9 window could be replaced by other embodiment that serves the same 
purposes. The operation of tracking and segmenting the incoming signal DSC based 

15 on the level of activity in the windows will be described below. 

FIG. 8 shows a block diagram of an embodiment 800 of the Dynamic Threshold 
Module 320. The embodiment 800 comprises three logic blocks 810, 820, 830, and a 
decision module 840. 

The three logic blocks 810, 820, 830 correspond to the three possible context 
20 windows shown in FIG. 7, i.e., the Single Pixel area, the 5x5 High-Resolution window 
W1 , and the 9x9 Coarse Resolution window W2, respectively. 

The multiplexer MUX can select and pass one of these outputs as the final GRS 
output signal. The selection can be switched on a pixel-by-pixel basis based on the 2-bit 
signal SEL. The actual selection code for each of the inputs is shown in FIG. 8 to the 
25 right of the input arrows. 

For the case of a Single Pixel context, the luminance component of the incoming 
input signal DSC is merely biased by subtracting from it a pre-determined 8-bit constant 
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THR, using the adder 815. The value of THR is stored in a programmable register so 
that it could be adjusted to accommodate the sensor calibration. For an ideal balanced 
incoming signal DSC that spans the full 8-bit luminance range, THR would be normally 
set to THR = 128 in order to bias the luminance of DSC such that the output signal GRS 
5 will have zero mean and the incoming signal will be thresholded halfway across. 
However, the visual threshold may well be skewed away from the center due to the 
logarithmic response of the human visual system. 

In addition, the scanner response may vary across the dynamic range, or may 
not even span the full 8-bit range. For example, the peak luminance value is 
10 determined by the brightest media reflectance, and the dark current of the sensor 
determines the output at low light levels. The value of the threshold register THR can 
be appropriately adjusted to account for the above considerations and better match the 
desired GRS response. In any case, only the luminance component of the incoming 
signal DSC is used for this biasing. 

15 The logic block 820 is used to address the 9x9 coarse resolution window context 

W2 shown in FIG. 7. The inputs to the logic block 820 are the full-color coarse 
minimum value MN and maximum value MX from the Dependent Max and Min blocks 
E3 and E4, respectively. Recall that these values were generated by sub-sampling the 
outputs of the Dependent Min-Max block E1 by a factor of 8 in both directions and then 

20 searching for the minimum and maximum (i.e., minimum of minima and maximum of 
maxima) over a 9x9 window. The operation of the logic block 820 is equivalent to 
performing the scaled dot product of the following two vectors X and Y: 

output 828 = < X , Y>; (1) 

where < X , Y > is the scaled dot product of the two vectors X and Y: 

25 < X , Y > = (X L , Xa, X b ) (Y L , Y a , Y b ) x = X L Y L + X a Y a + X b Y b ; (2) 

where 
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and 

Y = DSC-(MX + MN)/2 = 

The (L, a, b) values in equation (4) are the corresponding color components of 
5 the incoming signal DSC. The X vector in equation (3) is the vector difference between 
the maximum value MX and the minimum value MN. The Y vector in equation (4) is the 
incoming signal DSC minus the average of the minimum MN and maximum MX values, 
the average being the 3D midpoint between MN and MX. By taking the scaled dot 
product of these two vectors, the output is proportional to the relative distance from the 
10 plane that is perpendicular to the X vector and crosses it halfway along. Since the 
sought-after information is the location of the zero-crossing, the precise magnitude of 
the dot product is not required. Therefore, the result is divided by an arbitrary factor of 
256 (shift right by 8) to scale it back to fit the 8-bit range. 

However, since the logic block 820 output (to multiplexer 848) may still 
15 occasionally overflow the 8-bit range (by a factor of roughly 3, or 1.5 bits), additional 
logic may be used to limit the logic block 820 output to 255 if it gets larger than 255. 

A scalar measure for the overall contrast magnitude X9 within the coarse 
resolution 9x9 window is generated by adding together the absolute values of the three 
components of the vector X within the summation block 829: 

20 X 9 = L x + I a x I + I b x I = L M x - Lmn + I a MX - a MN I + I br^x - b M N I ; (5) 

Referring to equation (5), there is no need to take the absolute value of the 
luminance component L since L is confined to the positive range [0..255]. The 
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a-(a MX +a MN )/2 
b-Cb^+b™)^] 
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implementation of equations (1) through (5) for the logic block 820 is straight-forward. 
Referring to logic block 820 in FIG. 8, the first two adders 821, 823 perform the vector 
sum and difference of the 3x1 input signals MX, MN, on a component by component 
basis. The adder 821 that handles the sum also divides the result by 2 (by shifting it 
5 right by 1 position) to obtain the average as indicated by the symbol 1/2. Adder 823 
outputs the vector difference X (defined in equation (3)) to block 829. Block 829 
computes the sum of absolute values of the three components of the vector X and 
generates the contrast magnitude X 9 . Adder 825 calculates the vector Y in equation (4) 
by performing the vector difference between the input signal DSC and the output from 
10 adder 821. The X and Y vector components are then multiply-and-added together, 
element by element, to form the dot product in the dot product block 827. The output 
828 of block 827 is described by equations (1) and (2). 

FIG. 16 is a graphical illustration of equations (1) through (4). In FIG. 16, the 
origin of the three-dimensional space is assumed to be on the left hand side, as shown. 

15 The vectors MX, MN represent the three-dimensional points MX and MN, respectively. 
The current image pixel is represented by the vector DSC. . As illustrated, the vector X 
= (MX - MN) and the vector Y result from vector operations in accordance with 
equations (3) and (4). The value di represents the result of taking the dot product of X 
and Y. This value is the projection of Y onto X. This value also indicates the distance 

20 and "direction" of the point represented by the vector Y with respect to the plane Pi. 
The plane Pi is orthogonal to the vector X = MX - MN at the midpoint of X. By 
"direction" of the point represented by vector Y, it is meant whether this point is above 
or below the plane Pi. The plane P t represents the threshold plane. The indicator 
value indicates whether, after thresholding, the current image pixel DSC is above or 

25 below the threshold plane, that is, whether it is closer to MX or to MN, and by how 
much. This indicator value allows a decision to be made regarding segmentation of 
the current pixel. For example, if the thresholded pixel is very close to MX (respectively, 
MN), a decision can be made that the current pixel be included in the Foreground plane 
(respectively, Background plane). If the thresholded pixel is too close to the threshold 
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plane, a decision can be made that the current pixel be included in both the Foreground 
and Background planes. 
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Referring to FIG 8, the logic block 830 is used to address the 5x5 high-resolution 
window context W1 shown in FIG. 7. The inputs to the logic block 830 are the full-color 
minimum and maximum values Mn, Mx from the Dependent Min-Max module E1 . The 
operation of the logic block 830, in forming a scaled dot product, is similar to the logic 
block 820 described above. 

The operation of the logic block 830 is equivalent to performing the scaled dot 
product of the following two vectors: 

output 838 = < X' , Y' > ; (6) 

where < X' , Y f > is the scaled dot product between the two vectors X' and Y f : 

< X' , Y f >= (X L ', Xa', X b *) (Y L ', Y a ', Y b t = X L ' Y|_* + Xa' Y a ' + X b * Y b ' ;(7) 

where 



X' = Mx - Mn = 



a Mx " 3 Mn 
j>Mx ~ b Mn 



(8) 



15 and 



20 



Y' = DSC-[BLR_A + (Mx + Mn)/2]/2 = 



L-[L A +(L Mx +L Mn )/2]/2 
a-ta A +(a Mx +a Mn )/2]/2 
b-[b A +(b Mx +b Mn )/2]/2j 



I O) 



The (L, a, b) values in equation (9) are the corresponding color components of 
the incoming signal DSC. The X' vector In equation (8) is the vector difference between 
the maximum vector Mx and the minimum vector Mn. The Y f vector in equation (9) is 
the incoming signal DSC minus the average of the minimum Mn and maximum Mx 
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values, the average being the 3D midpoint between Mn and Mx. By taking the scaled 
dot product of these two vectors, the output is proportional to the relative distance from 
the plane that is perpendicular to the X' vector and crosses it halfway along. Since the 
sought-after information is the location of the zero-crossing, the precise magnitude of 
5 the dot product is not required. Therefore, the result is divided by an arbitrary factor of 
256 (shift right by 8) to scale it back to fit the 8-bit range. 

However, since the logic block 830 output (to multiplexer 848) may still 
occasionally overflow the 8-bit range (by a factor of roughly 3, or 1 .5 bits), additional 
logic may be used to limit the logic block 830 output to 255 if it gets larger than 255. 

10 A scalar measure for the overall contrast magnitude X5 within the fine resolution 

5x5 window W1 (FIG. 7) is generated by adding together the absolute values of the 
three components of the vector X' within the summation block 839: 

Xs = L>c + I a x « I + I b)c I = L Mx - L Mn + I a Mx - a Mn I + I bMx - bMn I; (10) 

Referring to equation (10), there is no need to take the absolute value of the 
luminance component L since L is confined to the positive range [0..255]. The 
implementation of equations (6) through (10) for the logic block 830 is straight-forward. 
Referring to logic block 830 in FIG. 8, the first two adders 831 , 833 perform the vector 
sum and difference of the 3x1 input signals Mx, Mn, on a component by component 
basis. The adder 831 that handles the sum also divides the result by 2 (by shifting it 
right by 1 position) to obtain the average as indicated by the symbol Z/2. Adder 833 
outputs the vector difference X* (defined in equation (8)) to block 839. Block 839 
computes the sum of absolute values of the three components of the vector X' and 
generates the contrast magnitude X5. Adder 834 adds the vector signal BLR_A to the 
vector output of adder 831 and divides the result by 2. Adder 835 calculates the vector 
Y' in equation (9) by performing the vector difference between the input signal DSC and 
the output from adder 834. The X' and Y* vector components are then multiply-and- 
added together, element by element, to form the dot product in the dot product block 
837. The output of block 837 is described by equations (6) and (7). 
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It is important to note that the architecture of logic block 830 differs from that of 
logic block 820 by having the added threshold-biasing feature that enhances dark or 
light thin lines by "nudging" the threshold towards the Super-Blur reference signal 
BLR_A = (La, a A , b A ) when BLR_A is available. This is accomplished by averaging the 
5 Super Blur signal BLR_A with the averaged Mx and Mn values, to form the alternative 
Y" vector, as shown in equation (9). 

FIG. 17 is a graphical illustration of equations (6) through (9). In FIG. 17, the 
origin of the three-dimensional space is assumed to be on the left hand side, as shown. 
The vectors Mx, Mn represent the three-dimensional points Mx and Mn, respectively. 

10 The current image pixel is represented by the vector DSC. The Super-Blur reference 
signal BLR_A is represented by the vector BLR_A. As illustrated, the vector X' = (Mx - 
Mn) and the vector Y' result from vector operations in accordance with equations (8) 
and (9). The value d 2 represents the result of taking the dot product of X' and Y\ This 
value is the projection of Y" onto X'. This value also indicates the distance and 

15 "direction" of the point represented by the vector Y 1 with respect to the plane P 2 . The 
plane P 2 is orthogonal to the vector X = MX - MN at a point away from the midpoint by a 
small amount. This amount represents the added threshold-biasing feature discussed 
in the preceding paragraph. By "direction" of the point represented by vector Y\ it is 
meant whether this point is above or below the plane P 2 . The plane P 2 represents the 

20 threshold plane. The indicator value d 2 indicates whether, after thresholding, the 
current image pixel DSC is above or below the threshold plane, that is, whether it is 
closer to Mx or to Mn, and by how much. This indicator value d 2 allows a decision to be 
made regarding the segmentation of the current pixel. For example, if the thresholded 
pixel is very close to Mx (respectively, Mn), a decision can be made that the current 

25 pixel be included in the Foreground plane (respectively, Background plane). If the 
thresholded pixel is too close to the threshold plane, a decision can be made that the 
current pixel be included in both the Foreground and Background planes. 
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Referring to FIG 8, the decision module 840 receives the output 818 from logic 
block 810, output 828 and contrast magnitude output X 9 from logic block 820, output 
838 and contrast magnitude outputX 5 from output block 830. 

The decision module 840 comprises a comparator logic 846, a multiplexer 848, 
5 an enhancement coding block 850, and a comparator 852. The decision module 840 
also includes two parameterized piecewise linear function blocks 842 and 844 to 
process the halftone weight signal HTW when it is available from a de-screener system. 

The comparator logic 846 receives the contrast magnitude outputs X 5 and X 9 , 
outputs the select signal SEL to control the output GRS of multiplexer 848, and outputs 
10 the enable signal EN A to control the enhancement signal ENH of the enhancement 
logic 850. The comparator logic 846 may also use the 8-bit Halftone Weight frequency 
estimate HTW, when available, from a de-screener system, after the estimate HTW has 
passed through a parameterized piecewise linear function block 842. 

Note that, by definition of the min-max operations described previously, the 
15 contrast magnitude of the larger 9x9 (sub-sampled) window W2 must be equal or larger 
than the contrast magnitude of the smaller 5x5 high-resolution window W1 . In other 
words: 

This is due to the fact that, for a larger window that includes a smaller one, the 
20 maximum can only be larger and the minimum smaller than those of the smaller 
window. Furthermore, as the segmentation process proceeds from one pixel to the next 
(in the fast scan direction), the X 9 contrast value remains the same for 8 consecutive 
pixels until the next pixel crosses the 8x8 window boundaries into the next non- 
overlapping window. The Xs contrast value, on the other hand, may change on a pixel 
25 by pixel basis. This behavior is due to the 8x sub-sampling performed by the 
Dependent Min-Max Sub-Sample block E2. 
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FIG. 9 shows the block diagram of an embodiment of the comparator logic 846. 
The two contrast magnitude measures X 5 and X 9 are compared to the signal STH, via 
comparators 904, 902, respectively, to generate the selection bits SELO and SEL1, 
respectively. The bits SELO and SEL1 form the 2-bit select signal SELECT. If the 
5 halftone weight HTW is available, HTW is passed through the piecewise linear function 
block 842 to produce STH. Otherwise, STH is set to a predetermined value. The two 
bits SELO and SEL1 are then combined together by AND gate 906 to generate the 1-bit 
enhancement enable signal ENA. 

FIG. 10 shows the equivalent Truth Table for the comparator logic. If the 
10 contrast measure X 9 of the larger 9x9 (sub-sampled) window W2 is smaller than STH, 
then, regardless of the contrast measure X 5 of the smaller 5x5 window W1 , the SEL1 bit 
is cleared and the SELECT signal is either 0 or 1 . This causes the multiplexer 848 to 
select the Single Pixel context output 818 (FIG. 8). If, however, there is some activity in 
the larger 9x9 window W2 but not within the smaller 5x5 window W1, the SELECT 
15 signal is set to equal 2 (binary "10"). This causes the multiplexer 848 to select in the 
logic block 820 output 828. If both windows show significant contrast magnitude, the 
SELECT signal is set to 3, resulting in the output 838 of logic block 830 (corresponding 
to the 5x5 high-resolution window) being selected by the multiplexer 848. In addition, 
when the SELECT signal is 3, the binary enable signal ENA is turned on. The signal 
20 ENA is used to enable the enhancement block 850 to output the segmentation 
enhancement signal ENH. 

Referring to FIG. 8, the enhancement coding block 850 also uses a linear 
function of the Halftone Weight frequency estimate HTW to produce the signal ENH 
which controls the amount of segmentation enhancement to be applied in the FG/BG 
25 Separation block E8 (FIG. 3). The HTW signal is fed to the parameterized piecewise 
linear function block 844 which applies a piecewise linear function EEN to the signal 
HTW, and outputs the resulting signal to the enhancement coding block 850. The 
binary enhancement enable signal ENA from the comparator logic 846 is used for 
gating (i.e., enabling) the enhancement signal ENH as follows. If ENA = 1, then the 
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block 844 output signal is passed through to the output ENH; otherwise, all of the ENH 
bits are forced to zero (disabled). The 8-bit ENH output signal communicates the 
amount of segmentation enhancement to be applied in the FG/BG Separation block E8. 

Referring to FIG. 8 f the comparator 852 compares the optional halftone weight 
5 HTW with a predefined threshold THRseg and generates the optional binary signal SEG. 
The optional binary signal SEG provides a means to control the segmentation function 
of the FG/BG Separation block E8. The binary signal SEG is turned on when HTW is 
larger than THRseg- When SEG is on, the segmentation in the FG/BG Separation block 
E8 is performed in accordance with equations (14), (15), (16), (17). The binary signal 
10 SEG is turned off when HTW is smaller than or equal to THRseg- When SEG is off, the 
segmentation in the FG/BG Separation block E8 is performed in accordance with 
equations (18), (19), (20). The value of the 8-bit THRseg may be programmable via a 
register. 

Referring to FIG. 3, the Selector Logic block E6 converts the 8-bit grayscale 
15 selector input GRS to the binary Selector plane output SEL which could be at a different 
resolution relative to the input image signal DSC. For high-quality text and line art, the 
binary Selector output SEL is typically kept at a higher resolution than the Background 
and Foreground planes. For example, the embodiment 300 uses twice the resolution, 
so that, for a standard 600-dpi scanner, the output resolution of the binary SEL signal is 
20 normally set at 1200 dpi. The Selector Logic block E6 interpolates the gray selector 
input GRS to the higher resolution and thresholds the resulting interpolated signal to 
generate the binary output SEL which forms the Selector plane. 

FIG. 1 1 shows the block diagram of the Selector Logic block E6. The Selector 
Logic block E6 comprises an interpolator 1 1 02 and a comparator 1 1 04. The interpolator 
25 1102 interpolates the 8-bit gray selector input signal GRS in the gray domain using a 
two-dimensional bilinear interpolation scheme. The interpolation factor may be 
programmable under software control. In the embodiment 1102 shown in FIG. 11 (and 
FIG. 3), the interpolation factor is set to the default setting of 2x. The output from the 
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interpolator 1102 is fed to the comparator 1104, which thresholds it against a value 
TSL, and outputs the resulting binary output as the Selector signal SEL. The value of 
the threshold TSL is stored in a programmable register so that that it can be changed 
from one page of a document to the other. For an ideal signed gray selector signal 
5 GRS that spans the full 8-bit luminance range, the value TSL would be normally set to 
the mid-point TSL = 0 to threshold the incoming signal halfway across. However, as 
explained previously for the signal THR in block 810 (FIG. 8), the actual TSL value may 
be set differently in order to compensate for the actual dynamic range of the scanner 
sensor and/or the properties of the human visual system. 

10 Note that for the current 2x interpolation-factor, the binary SEL output data rate is 

twice as fast as the image signal DSC in each direction. This means that for every 8-bit 
input GRS sample, the Selector Logic block E6 produces four binary output SEL 
samples. 

It is important to note that a vector notation is used for the output SEL to indicate 
15 the higher output resolution. While the output SEL is still considered to be binary (that 
is, assuming the values of either 0 or 1 only), each incoming GRS input generates four 
Selector bits at the output (assuming the normal 2x interpolation factor). The output 
data rate is half the input rate, since one 8-bit value is converted to four 1-bit values. 

Referring to FIG. 3, the Edge Processing block E7 receives the high resolution 
20 Selector output SEL and counts the number of on and off pixels in a 5x5 high-resolution 
window centered on the current pixel of interest. The Edge Processing block E7 
outputs the two-bit signal SEE. The SEE signal is set to 0 if all of the input pixels inside 
the window are 0 (corresponding to a constant 3x3 original input resolution Background 
area). Similarly, the SEE signal is set to 3 if all of the input pixels inside the window are 
25 1 (corresponding to a constant 3x3 original input resolution Foreground area). In 
addition, the SEE output is set to either 1 or 2 if the content of the 3x3 window is mostly 
Background (white) or mostly Foreground (black), respectively. 
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FIG. 12 illustrates the function of the Edge Processing block E7. The operation 
of block E7 is as follows. The Edge Processing block receives as input th binary 
Selector signal SEL, which is at higher resolution (normally 2x) relative to the source 
input signal DSC. The Edge Processing block E7 maintains a 3x3-pixel context window 
5 W3 (at the original input resolution) centered at the current pixel of interest. Since the 
Selector signal SEL is at twice the original resolution, each of the elements in the 3x3- 
pixel window W3 contains four of the high-resolution binary SEL samples as shown in 
FIG. 12. The thicker lines represent the original DSC resolution, which corresponds to a 
6x6-pixel context window in the high-resolution domain. However, only the dot content 
1 0 of 5x5 high-resolution pixel area W4 (shown in dashed lines) is used; the area between 
W3 and W4 is excluded from the edge processing. 

The 5x5 high-resolution context W4 is designed for "detecting" potential edges in 
the proximity of the current pixel of interest. The window pattern W4 uses a full context 
of 2 high-resolution pixels extending below and to the right of the current pixel, but only 
15 one high resolution pixel extending to the top or to the left. This window pattern 
prevents any edge from overlapping with the neighboring pixels, that is, no potential 
edge location could be detected (i.e., shared) more than once, in association with 
multiple lower resolution pixels. The 4x4=16 possible edge locations within the current 
window of interest are indicated in FIG. 12. 

20 The Edge Processing block E7 counts the number of high-resolution pixels that 

are currently turned on in the 5x5 high-resolution area W4. This number can range from 
0 to 25. It is mapped to the output 2-bit signal SEE of the Edge Processing block as 
following: 

1 . SEE = 0 if the 5x5 count was 0 (no Foreground pixels found); 
25 2. SEE = 1 if the count was in the range [1 ..12] inclusive (mostly Background pixels); 

3. SEE = 2 if the count was in the range [13..24] inclusive (mostly Foreground pixels); 

4. SEE = 3 if the count was 25 (only Foreground pixels found); 
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The 2-bit output signal SEE is forwarded to the FG/BG Separation block E8. It is 
noted that this signal is at the original input resolution (typically, 600 dpi), although the 
Selector signal is at the higher resolution of double the original input resolution (for this 
example, 1200 dpi). 

The FG/BG Separation block E8 splits the incoming image signal DSC into the 
Foreground and Background planes. The block uses the full-color minimum and 
maximum (Mn, Mx) outputs from the Dependent Min-Max block E1, as well as the 2-bit 
Selector edge count signal SEE from the Edge Processing block E7. In addition, the 
FG/BG Separation block can also enhance the lifting of edges via the segmentation 
enhancement control signal ENH received from the Dynamic Threshold Module 320. 
As discussed previously, the segmentation process in the FG/BG Separation block E8 
can be controlled by the optional binary signal SEG. 

The FG/BG Separation block E8 outputs two full-color rough initial estimates Fgr 
and Bgr of the Foreground and Background outputs, respectively. The subsequent 
15 module, the FG/BG Cleanup block E9, further processes the estimates Fgr and Bgr to 
produce the final Foreground and Background outputs. 

The FG/BG Separation block E8 takes in the full-color source signal DSC to be 
segmented and produces values for one of the Fgr and Bgr outputs, or sometimes for 
both. The FG/BG Separation block E8 reserves the special code of zero luminance and 

20 chroma (L = a = b = 0) to indicate an empty (undefined) pixel in either the Foreground 
estimate Fgr or the Background estimate Bgr. As the process continues across the 
page, some of the Foreground and Background pixels will remain undefined. The 
FG/BG Cleanup block will then carefully fill in the values for these undefined pixels to 
keep the compression low and to prevent potential subsequent sub-sampling and JPEG 

25 compression artifacts. 

FIG. 13 illustrates the decision range used by the FG/BG Separation block E8 for 
separating the image signal into the Background and Foreground planes. The FG/BG 
Separation block uses the value of the 2-bit Selector edge count signal SEE from th 
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Edge Processing block E7 to determine whether to use the Background, Foreground, or 
both. The Foreground is used if SEE £ 1, and the Background is used if SEE £ 2. 
Since these two cases are not mutually exclusive, both the Foreground and Background 
are used simultaneously when 1 £ SEE < 2. 

5 The following two full-color quantities (the purpose of which will be detailed 

below) are defined: 

FGD = LIM[DSC + (Mn-DSC)(ENH/255)]; (12) 

BGD = LIM[DSC + (Mx-DSC)(ENH/255)]; (13) 

where DSC is the full-color input signal, Mn and Mx are the dependent minimum and 
10 maximum outputs from the Dependent Min-Max block E1, and ENH is the 8-bit 
segmentation enhancement signal from the Dynamic Threshold Module 320. The limit 
function LIM is used to restrict the result to the 8-bit range [1..255] for each of the 
components, thus excluding the special code of zero that is reserved for marking 
undetermined pixels. Note that since DSC and Mn or Mx are full-color (L, a, b) vectors, 
15 the operations are in 3D space. 

For the case of using the Foreground during segmentation, that is, when SEE = 
{1 , 2, 3} and SEG = 1 , the output Fgr value is determined to be: 

Fgr = FGD , if SEG=1 AND SEE = {1 , 2, 3} ; (14) 

Fgr = 0 if SEG=1 AND SEE = 0 ; (15) 

20 When the control signal SEG is on (SEG=1) and the Foreground is not used (i.e., 

when SEE = 0), the Foreground pixel value is marked as undefined by setting its value 
to the special code Fgr = 0 (for all of the three components), per equation (15). Note 
that, in one implementation, ENH is extended to a 9-bit representation and its value is 
incremented by 1 to allow normalization by 256 instead of 255, thus replacing the 

25 unnecessary division with a simple shift right by 8 positions. 
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As shown in equations (14) and (12), the Foreground estimate Fgr value is 
interpolated (in 3D space) between the current input signal value DSC and the minimum 
value Mn of the Dependent Min-Max block E1 t depending on the amount of 
segmentation enhancement expressed by the 8-bit signal ENH. When ENH = 0, no 
5 enhancement will be done and the output will be set to the input signal Fgr = DSC. As 
previously discussed, this would be the common case unless there would be sufficient 
contrast activity in both the 5x5 fine resolution window W1 and 9x9 coarse resolution 
window W2. Equivalently, when ENH = 255 (corresponding to maximum 
enhancement), the output will be set to the minimum signal Fgr = Mn. This would 

10 normally represent the case of a pixel in the immediate vicinity of an edge, where it 
would be advantageous to enhance the edge by painting the Foreground darker by as 
much as possible, as given by the minimum value in the neighborhood (0=black). In 
general, the amount of segmentation enhancement ENH can vary between the above 
two extremes, and the output Foreground values will be correspondingly weighted 

1 5 between the DSC and Mn values. 

Similarly, for the case of using the Background during segmentation, i.e., when 
SEG = 1 and SEE = {0, 1, 2}, the output Bgr value is determined by the following 
equations: 

Bgr = BGD if SEG=1 AND SEE = {0,1,2}; (1 6) 

20 Bgr = 0 if SEE = 3; (17) 

The output Bgr value will vary between the input DSC and Mx values in 
proportion to the amount of segmentation enhancement ENH, as given by equation 
(13). Equation (16) is similar to equation (14), with the exception of using the maximum 
Mx instead of the minimum Mn (see also equations (12) and (13)), and the different 
25 range of SEE. The use of Mx for the Bgr output will make it become visually lighter 
rather than darker as was the case for the use of Mn for the Foreground estimate Fgr. 
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Also, as indicated by equation (17) and corresponding with (15), when the 
Background is not used (i.e., when SEE = 3), the Background value is marked as 
undefined by setting its value to the special code Bgr = 0 (for all of the three 
components L, a, b). 

5 In the case where the optional segmentation control is turned off (i.e., SEG = 0), 

the Background is arbitrarily set to be a special code, regardless of the input pixel value: 

Bgr = 0, if SEG = 0 ; (18) 

and the Foreground is set to be one of the following: 

Fgr = BGD , if SEG=0 AND SEE = {0, 1} ; (19) 

1 0 Fgr = FGD , if SEG=0 AND SEE = {2, 3} ; (20) 

Equations (18), (19), (20) show that the Background is always set to the special 
undefined marker Bgr = 0, and the Foreground takes on the value of the weighted 
Foreground or Background pixel of (12) or (13), depending on the value of SEE. If SEE 
= {0,1}, the Background is used, otherwise the Foreground used. 

15 It is important to note that the range of SEE for this case, where the optional 

segmentation control is turned off, is different than what was used in equations (14) 
through (17). Here the Foreground value is simply set to FGD if most of the pixels are 
foreground ( SEE ={0, 1} ), or to BGD if they are mostly background ( SEE ={2, 3} ). 

The output from the FG/BG Separation block E8 is the two partially filled full-color 
20 planes Fgr and Bgr. Away from the edges in the Selector plane SEL, typically only one 
of the Foreground or Background outputs will contain the current pixel color (possibly 
enhanced), depending on whether it was light or dark. But near the edges, however, 
information might be carried in both the Foreground and Background channels 
simultaneously. 
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The FG/BG Cleanup block E9 applies additional processing on the rough 
Foreground and Background estimates Fgr and Bgr to generate the final Foreground 
and Background outputs FG and BG. The processing done by the FG/BG Cleanup 
block E9 is to slightly extend the Foreground and Background values beyond edges and 
5 also fill the undefined pixels in the Foreground and Background planes. The purpose of 
this processing is to prevent artifacts that may result from a subsequent sub-sampling 
and data compression (such as JPEG) and fill in the undefined pixels to achieve good 
compression ratio. 

An additional logic inside the FG/BG Cleanup block E9 also monitors the 
10 Foreground and Background output values to detect and flag constant all-black or all- 
white tiles. Rather than encode the output from such tiles into the output data stream, a 
special tile marker is used and referenced as many times as needed. This will increase 
the overall compression ratio by eliminating the need to repeatedly encode the common 
all-white or all-black tiles. 

15 FIG. 14 shows a block diagram of an embodiment of the FG/BG Cleanup block 

E9. The embodiment comprises two Dilate blocks F1, F2, two Averaging blocks F3, F4, 
two Fill / Sub-Sample blocks F5, F6, and a Tile Tag block F7. 

The Dilate blocks F1 and F2 extend the defined content of the Foreground and 
Background inputs by two pixels. The extension by 2 pixels is done in anticipation of a 
20 subsequent 2x sub-sampling; other extension sizes could be used for different sub- 
sampling factors. The purpose is to increase the Foreground and Background overlap 
over edges so that the outcome is less sensitive to further sub-sampling. 

FIG. 15 illustrates the effect of the dilate operation of Dilate blocks F1 and F2. 
The area 1504 represents the defined pixels. The area 1502 represents the undefined 
25 pixels. After the dilate operation, the area of defined pixels is extended to cover also 
area 1506, and the area of undefined pixels is reduced to area 1508. 
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The operation of the Dilate blocks is achieved by using lowpass filters Fz_n. The 
n suffix indicates the number of the filter coefficients. In one embodiment, two- 
dimensional 5x5 triangular filters Fz_5 are used. The z suffix indicates that such filter 
eliminates any zero terms from the total normalization weight. 

5 The general Fz_n filter equation is as follows: 

output value = * Wy - ; (21) 

I>ii* 8 ( a ii> 

where ay are the 2D input values, wy are the 2D filter coefficients, and 5(a«j) is defined as 
follows: 

8(aij) = 1 if a* * 0 ; otherwise 5(aij) = 0 ; (22) 

10 As can be seen from equation (21), an Fz_n filter differs from a regular filter in 

that the total weight is no longer just a known normalization constant. Since the number 
of incoming pixels having the value of zero is not known in advance, an ongoing 
accumulator for the weight must be kept. At the same time that the filter loop is 
operating, the content of the accumulator is increased by the value of the current filter 

15 coefficient if the corresponding input value is non-zero. Since the total weight is not 
fixed and known in advance, the final normalization of the filter output is dependent on 
the value of the total weight. However, it is still possible to avoid the division operation 
in equation (21) by using a predetermined multiplication table with multiple choices for 
the possible total weight values. 

20 The purpose of using the special Fz_n filters is to eliminate any pixels marked as 

undefined with the special zero marker from contaminating the filtered output. Since the 
special marker value was selected to be zero, the undefined pixels do not affect the 
numerator in (17). However, the number of these undefined pixels must be tracked in 
order to keep the denominator accurate. 
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Whenever the filtered output of the Fz_J> filter in the Dilate block F1 (or F2) is 
non-zero (that is, at least one pixel within the 5x5 window center d on the current pixel 
of interest is non-zero, or fully defined), it replaces the special zero marker used for the 
undefined pixels. Note that the filtered outputs of the Fz_5 filters in the dilate blocks F1 , 
5 F2 are only used for previously undefined pixels, and only affecting pixels within a two- 
pixel area in the immediate vicinity of an edge. 

Referring to FIG. 14, the Averaging blocks F3 and F4 average the non-zero 
content of the diluted Foreground and Background values over JPEG MCU (Minimum 
Coded Units) blocks. These blocks have a size of 16 x 16 for a non sub-sampled 
output, and 32 x 32 if the output is sub-sampled by a factor of 2 in each direction (other 
sizes may be used for different sub-sampling factors). Again, the averaging is carefully 
done to exclude any undefined pixels that might otherwise bias the result. The 
averaging operation could be viewed in a similar manner to using Fz_n filters but with 
constant unity weight rather than triangular shapes. The Block Averages are performed 
on the fixed non-overlapping JPEG MCU block grid. 

The Fill / Sub-Sample blocks F5 and F6 substitute the averaged Background and 
Foreground values received from the F3 and F4 units for the values of any remaining 
undefined pixels in the diluted Background and Foreground channels, respectively. 
With these substituted averaged values, the compression ratio is improved and further 
20 JPEG-ringing artifacts are largely prevented. Since the Background and Foreground 
channels are typically sub-sampled, the Fill / Sub-Sample blocks F5 and F6 also 
perform the sub-sampling operation as needed, and output the Background and 
Foreground signals BG, FG. 

The Tile Tag block F7 monitors the values of the Foreground and Background 
25 channels over tiles and flags any tiles that effectively have all white or all black values. 
The tile sizes are programmable, but typical values vary from 64 x 64 to 512 x 512 
pixels. A tile is considered to be all white if the absolute chroma values (i.e., both lal 
and Ibl) of each and every pixel within the tile are smaller than given thresholds, and the 
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luminance is larger than a given threshold. Similarly, a tile is considered to be all black 
if the absolute chroma values (i.e., both lal and Ibl), as well as the luminance value, are 
all smaller than the given thresholds. The three thresholds above may be programmable 
using registers. 

5 The following describes how the optional signal HTW can be generated from a 

Halftone Estimator module. 

The Halftone Estimator module is used for measuring the frequency and halftone 
weight in a small area surrounding the current pixel of interest. The inputs to the 
Halftone Estimator module are the source signal SRC and an output BLR_3 from a 
10 small lowpass filter (e.g., having 3 coefficients). Both of these inputs are full-color (L, a, 
b) signals. 

The Halftone Estimator module produces two monochrome (single-channel) 
output signals, FRQ and HTW, representing the estimated frequency and halftone 
weight, respectively. Each of these signals is represented by an 8-bit representation. 
15 HTW represents the level of confidence in the area being a halftone. If HTW is small 
(low confidence), active segmentation is turned off to prevent lifting of each and every 
halftone dot. 

FIG. 18 is a block diagram of an embodiment of the Halftone Estimator module. 

20 As shown in FIG. 18, the embodiment of the Halftone Estimator module 

comprises two separate frequency detection channels operating concurrently, the 
outputs of which are combined together at the very last step to generate the halftone 
weight. Each frequency channel comprises a Min-Max texture detector (C2 and D2), 
followed by cascaded averaging filters. Most of the averaging filters also sub-sample 

25 the data by a factor of 2 at the same time (that is, only produce every other filtered 
value), so that the peak bandwidth is greatly reduced. Only at the very end the data is 
up-sampled back to the original resolution. 
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The notation in FIG. 18 is intended to emphasize the similarity between the two 
frequency channels by drawing the various blocks as matching pairs connected by the 
dotted lines (there is no actual connection). 

Matching pairs of blocks are given the same block number, with the initial letter C 
5 for the original frequency channel and D for the blurred frequency channel. Matching 
pairs are therefore designated as (Cn, Dn), with n=[2,...,9]. 

The notation used for the various block names is as following: the first number 
(underscored) denotes the size of the window used; the second number following the 
slash denotes the amount of sub-sampling that is done inside the block in each 
10 direction. Thus, for example, a filter designated BJ&/2 indicates a blurring (lowpass) 
filter B with 3x3 window size, whose output is sub-sampled by a factor of 2 in both 
directions (that is, only one output is sent for every 2x2 = 4 input pixels). 

The following provides a more detailed description of the various blocks included 
in the Halftone Estimate module. 

15 The Sub-Sampling unit D1 is reducing the data rate by a factor of 4 by omitting 

every other pixel and line of the input. The sub-sampling is only needed in the blurred 
frequency channel D. No sub-sampling is needed for the full-resolution frequency 
channel C. The input to the SS/2 unit is the full-color (L, a, b) output signal BLR_3 from 
a small lowpass filter. The output of the SS/2 unit is a full-color (L, a, b) signal. The 

20 preferred sub-sampling method is to apply a simple pre-filter prior to sub-sampling in 
order to eliminate potential frequency aliasing problems. 

The two identical Min-Max Detection modules C2, D2 are used for finding peaks 
and valleys in the input signal. By counting the number of peaks and valleys per unit 
area, a measure of the local frequency is obtained. 

25 Each of the two Min-Max Detection units is taking as input a full-color (L, a, b) 

signal. Each unit utilizes three 3x3 windows to indicate when the center pixel of one of 
the color components is at an extreme value (either peak or valley) relative to its 8 
neighbors, in accordance with the logic described below. 
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Each color component is independently examined in its own 3x3 window. The 
output from each Min-Max Detection unit is a 2-bit signal indicating the total number of 
color components that are in an extreme value state. This number could vary from zero 
(no color component was at an extremum) to 3 (all color components are at an 
5 extremum). When one or two color components are at an extremum, no distinction is 
made as to which ones are at an extremum; only the total number of components at an 
extremum is outputted. 

FIG. 19 shows the Min-Max Detection structure. For each color component, the 
outer ring of 8 pixels surrounding the center pixel (the current pixel of interest) is first 
10 analyzed. The 8 outer pixels are further divided into two sets of 4 pixels each as shown 
in FIG 19. The partitioning of the outer ring into two sets is useful for reducing the 
likelihood of false alarms in detecting straight-line segments as halftones (since most 
commonly encountered halftones are typically classified as clustered dots or line- 
screens). 

15 For each set, the pixel values are compared among the members of the set to 

determine the minimum and maximum values within each set independently: 



Amax = max ( Aij ) ; over all (i, j) belonging to the set A ; (23) 

Amin = min ( Aij ) ; over all (i, j) belonging to the set A ; (24) 

B max = max ( Bij ) ; over all (i, j) belonging to the set B ; (25) 

20 B min = min ( Bij ) ; over all (i, j) belonging to the set B ; (26) 

Then the overall outer ring contrast is computed from the above values: 

Anng = max ( Amax, B max ) - min ( A mln , B mjn ) ; (27) 

The value of A ring is next tested to see if there is any contrast for the outer ring. 
Regardless of the center pixel value, the output is set to zero (not an extremum point) if 
25 the value of A ri ng is less than or equal to a pre-defined small threshold T2: 

If (A ri ng < T2 ) , return (0) ; (28) 
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If, on the other hand, there is sufficient activity in the outer ring (as indicated by outer 
ring contrast > T2 ), then two tests are made to see if the center pixel value is at an 
extreme value relative to the outer ring values. The center pixel value X is defined to be 
at a peak if it is (significantly) larger than the maximum pixel value of either set: 

5 If [( Amax + S < X) AND ( B max < X )] , return (1) ; (29) 

where S is the outer ring contrast, scaled by a contrast scaling parameter C: 

S=Anng/C; (30) 

In one embodiment, the contrast scaling parameter C is set to equal to 8. The 
actual value of the scaling parameter C is a function of the signal noise level at the 
10 input. It is desirable to maintain the value of C as a generic parameter of the Min-Max 
Detection units. The value of C can be limited to a power-of-two number so that it could 
be implemented as an arithmetic shift to save the need to perform a division operation 
per pixel. 

Similarly, the center pixel value X is defined to be at a valley if it is (significantly) 
15 smaller than the minimum pixel value from either set A or B: 

If [( Amin > X + S ) AND ( B min > X )] , return (1) ; (31) 

Equations (29) and (31) determine the two conditions where the output from the 
3x3 detection window are set to 1 ; in all other cases the output will be set to 0. 

In a second embodiment, the center pixel value X is defined to be at a peak if it is 
20 (significantly) larger than the maximum pixel value of either set: 

If [( Acnax + Nth < X) AND ( Bmax < X )] , return (1) ; (29A) 
where Nth is the noise threshold defined as: 

Nth = NoiseBias + (NoiseFactor x X) / 256 

where the NoiseBias and the NoiseFactor are tuning parameters. 

25 Similarly, the center pixel value X is defined to be at a valley if it is (significantly) 

smaller than the minimum pixel value from either set A or B: 



If [( Amin > X + Nth ) AND ( B min > X )] , return (1 ) ; (31 A) 

Equations (29A) and (31 A) determine the two conditions where the output from 
the 3x3 detection window are set to 1 ; in all other cases the output will be set to 0. Note 
that in this second embodiment, there is no need to compute the overall outer ring 
contrast. 

Finally, each color component is processed independently through its own 
separate 3x3 window. The three binary outputs of the color components are then 
added together to form the final 2-bit output of the Min-Max Detection module. 

The two Min-Max Detection outputs C2 and D2 are fed to the Cascaded Filter 
Chain C3-C6 and D3-D6, respectively. The first filtering units, C3 and D3 are different 
between the two chains, but other than that, the subsequent units C4-C6 and D4-D6 are 
all identical. 

The first Filter unit C3 receives the 2-bit output from the high-resolution Min-Max 
detection unit C2. The input is filtered through the FJ7/4 filter, which, in one 
embodiment, is a 7x7 symmetric, triangular, and separable filter. The /4 notation 
indicates that filter FJ7/4 also sub-samples the filtered output by a factor of 4 in both 
direction. This means that the filter F_7/4 only produces one output pixel for every 
fourth input pixel and every fourth line, thereby effectively reducing the data bandwidth 
by a factor of 16. 

Since the input to the first filtering unit C3 is limited to 2-bits (rather than 8 bits), 
the output of the filter is normalized by scaling the result by a different power-of-two 
number (e.g., 2). The scaling power should remain a design parameter. After the first 
normalization, however, the result has been scaled to fit the range of 8-bits, so that 
subsequent filtering is utilizing the 8-bit representation system from this point forward. 

The second chain leading filter unit D3 is different than C3 in two aspects. First, 
the F_5/2 filter only sub-samples the input by a factor of 2 (rather than 4) in each 
direction. This means that the filter only produces one output pixel for every other input 
pixel and every other line, thereby effectively reducing the data bandwidth by a factor of 
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4. Since the sub-sampling factor is smaller, the span of the filter can consequently be 
reduced from 7 (for C3) to 5 (for D3). In one embodiment, the normalization factor for 
the F_5/2 filter was determined to be 2 9 . Note that the (8-bit wide now) outputs from the 
two leading filter units C3 and D3 are both at the same resolution - sub-sampled by 4 in 
5 both dimension, or 1/16 th of the original input bandwidth. This is because the F__7/4 
filter of the C3 unit in the upper chain is sub-sampling the data by 4, while the 
combination of the SS/2 and F_5/2 unit D3 in the lower chain results in an overall output 
rate that matches the output rate of C3. 

The two outputs from the filtering units C3 and D3 are further filtered through the 
10 three additional and identical units C4-C6 and D4-D6 respectively. Each of the six 
filtering units processes its respective input signal with an F_3/2 filter (with coefficients 
1-2-1) which also sub-samples the data by a factor of 2 in both directions. Note that 
each of these filters has a total weight of 1+2+1=4, thus the implementation can be 
simplified by replacing the normalization division with a simple arithmetic shift right by 2. 

15 Since each of the filtering units is also sub-sampling its respective input signal by 

a factor of 2, the signals at the respective outputs of the C6 and D6 filter units are in 
effect sub-sampled by a factor of 32 in each direction (or 1024 times reduced in 
bandwidth). 

The next two filter units C7 and D7 are special filter units, denoted as a Fz_5 
20 filter. The z suffix indicates a filter that eliminates any zero terms from the total 
normalization weight. The general Fz_n filter equation is as following: 

output value = * g ™ 9 - ; (32) 

where ay are the 2D input values, wy are the 2D filter coefficients, and 8(aij) is the 
function defined as: 

25 5(aij) = 1 if ay * 0 ; otherwise 8(ay) = 0 ; (33) 

As can be seen from equation (32), the difference from a regular filter is in that 
the total weight is no longer just a known normalization constant. Since the number of 
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incoming pixels having the value of zero is not known in advance, an ongoing 
accumulator for the weight must be kept. At the same time the filter loop is operating, 
the content of the accumulator is increased by the value of the current filter coefficient if 
the corresponding input value is non zero. 

5 Since the total weight is not fixed and known in advance, the final normalization 

of the filter output is dependent on the value of the total weight. However, it is still 
possible to avoid the division operation in equation (32) by using a predetermined 
multiplication table with multiple choices for the possible total weight values. 

The purpose of using the special Fz_5 filters is to obtain reliable frequency and 
1 0 halftone weight estimates even when the filter gets very close to an edge. 

The two MX_5 modules C8 and D8 search for the maximum value in a 5x5 
window and output the maximum value. 

Each of the two Interpolation modules C9 and D10 interpolates (i.e., up-samples) 
the signal back by a factor of 32 to restore it to the original resolution. Each interpolation 
15 unit is performing bilinear interpolation, essentially generating 32*32 pixels for each 4 
original pixels. The step size for the bilinear interpolation is 1/32 nd of the original pixel 
grid. 

The Halftone Weight module HTW receives as input the outputs of the two 
interpolation units denoted as FRQ and FRQ_B. The Halftone Weight module adds 
20 together the contribution from each of the input as following: 

HTW = HTW H + HTW L ; (34) 

where 

HTW H = (FRQ - T H ) * SF H if FRQ > T H ; 0 otherwise ; (35) 

HTW L = (FRQ_B - T L ) * SF L if FRQ_B > T L ; 0 otherwise ; (36) 

25 where T H and T L are two predetermined thresholds, and SF H and SF L are two 

predetermined scale factors for the original (high) and filtered (low) frequencies FRQ 
and FRQJB respectively. 



An additional logic ensures that the value of HTW is limited to never exceed the 
allowed 8-bit range of [0, 255]. 

FIG. 20 is a diagram illustrating the equations (34), (35), (36) and the clipping 
effect of the additional logic which limits the value of HTW to the allowed range. The 
5 region denoted as "LA" represents the line-art region. As illustrated in FIG. 20, one 
particular color screen pattern can change from the location indicated as HFHT to 
MFHT to LFHT as its frequency is changed from high to medium to low. Since the 
curve illustrated by the loci on the 2D plot is convex, it is not possible to distinguish the 
screen frequency by observing either FRQ or FRQ_B alone. 

10 In the description above, elements of an embodiment of the invention may be 

implemented by hardware, firmware, software or any combination thereof. The term 
hardware generally refers to an element having a physical structure such as electronic, 
electromagnetic, optical, electro-optical, mechanical, electro-mechanical parts, etc. The 
term software generally refers to a logical structure, a method, a procedure, a program, 

15 a routine, a process, an algorithm, a formula, a function, an expression, etc. The term 
firmware generally refers to a logical structure, a method, a procedure, a program, a 
routine, a process, an algorithm, a formula, a function, an expression, etc that is 
implemented or embodied in a hardware structure (e.g., flash memory, read-only 
memory (ROM), erasable ROM). Examples of firmware may include microcode, 

20 writable control store, microprogrammed structure. When implemented in software or 
firmware, the elements of an embodiment of the present invention are essentially the 
code segments to perform the necessary tasks. The software/firmware may include the 
actual code to carry out the operations described in one embodiment of the invention, or 
code that emulates or simulates the operations. The program or code segments can be 

25 stored in a processor or machine accessible medium or transmitted by a computer data 
signal embodied in a carrier wave, or a signal modulated by a carrier, over a 
transmission medium. The "processor readable or accessible medium" or "machine 
readable or accessible medium" may include any medium that can store, transmit, or 
transfer information. Examples of the processor readable or machine accessible 

30 medium include an electronic circuit, a semiconductor memory device, a read-only 



42 



memory (ROM), a flash memory, an erasable ROM (EROM), a floppy diskette, a 
compact disk (CD) ROM, an optical disk, a hard disk, a fiber optic medium, a radio 
frequency (RF) link, etc. The computer data signal may include any signal that can 
propagate over a transmission medium such as electronic network channels, optical 
5 fibers, air, electromagnetic, RF links, etc. The code segments may be downloaded via 
computer networks such as the Internet, Intranet, etc. The machine accessible medium 
may be embodied in an article of manufacture. The machine accessible medium may 
include data that, when accessed by a machine, cause the machine to perform the 
operations described in the following. The machine accessible medium may also 
10 include program code embedded therein. The program code may include machine 
readable code to perform the operations described in the above description of the 
invention. The term "code" here refers to any type of information that is encoded for 
machine-readable purposes. Therefore, it may include program, code, data, file, etc. 

All or part of an embodiment of the invention may be implemented by hardware, 
15 software, or firmware, or any combination thereof. The hardware, software, or firmware 
element may have several modules coupled to one another. A hardware module is 
coupled to another module by mechanical, electrical, optical, electromagnetic or any 
physical connections. A software module is coupled to another module by a function, 
procedure, method, subprogram, or subroutine call, a jump, a link, a parameter, 
20 variable, and argument passing, a function return, etc. A software module is coupled to 
another module to receive variables, parameters, arguments, pointers, etc. and/or to 
generate or pass results, updated variables, pointers, etc. A firmware module is 
coupled to another module by any combination of hardware and software coupling 
methods above. A hardware, software, or firmware module may be coupled to any one 
25 of another hardware, software, or firmware module. A module may also be a software 
driver or interface to interact with the operating system running on the platform. A 
module may also be a hardware driver to configure, set up, initialize, send and receive 
data to and from a hardware device. An apparatus or system may include any 
combination of hardware, software, and firmware modules. 
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One embodiment of the invention may be described as a method or process 
which is usually depicted as a flowchart, a flow diagram, a structure diagram, or a block 
diagram. Although any of these depictions may describe the operations as a sequential 
process, many of the operations can be performed in parallel or concurrently. In 
addition, the order of the operations may be re-arranged. A process is terminated when 
its operations are completed. A process may correspond to a method, a program, a 
procedure, a method of manufacturing or fabrication, etc. 

While certain exemplary embodiments have been described in detail and shown 
in the accompanying drawings, those of ordinary skill in the art will recognize that the 
invention is not limited to the embodiments described and that various modifications 
may be made to the illustrated and other embodiments of the invention described 
above, without departing from the broad inventive scope thereof. It will be understood, 
therefore, that the invention is not limited to the particular embodiments or 
arrangements disclosed, but is rather intended to cover any changes, adaptations or 
modifications which are within the scope and spirit of the invention as defined by the 
appended claims. 
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WHAT IS CLAIMED IS 

1 1 . A method for separating an image signal into a set of image planes in 

2 accordance with a control signal, the method comprising the operations of: 

3 (a) receiving the control signal and producing a selector signal, via a selector 

4 module; 

5 (b) receiving the selector signal and producing a decision signal, via an edge 

6 processing module; and 

7 (c) receiving the image signal and the decision signal, and outputting a 

8 foreground signal and a background signal, via a foreground/background separation 

9 module, a representation of the current pixel of the image signal being included in at 

10 least one of the foreground signal and the background signal in accordance with the 

1 1 decision signal. 

2. The method of Claim 1 wherein operation (c) further comprises: 
receiving the foreground signal and the background signal; 

filling undefined pixels in the foreground and background signals with values computed 
so as to substantially prevent artifacts and to facilitate good compression ratio, using a 
cleanup module; and 

outputting a final foreground signal and a final background signal. 

1 3. The method of Claim 2 wherein the operation of filling comprises: 

2 extending content of defined pixels in each of the foreground and background signals to 

3 neighboring undefined pixels by filling neighboring undefined pixels with diluted 

4 foreground and background values, respectively, using a dilate module; 

5 averaging non-zero content of the diluted foreground and background values over 

6 minimum coded unit blocks and outputting averaged block values, using a block 

7 average module; and 
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8 filling any remaining undefined pixels with the averaged block values, using a fill 

9 module. 

1 4. A system for separating an image signal into a set of image planes in accordance 

2 with a control signal, the system comprising: 

3 a selector module receiving the control signal and producing a selector signal; 

4 an edge processing module receiving the selector signal and producing a 

5 decision signal; 

6 a foreground/background separation module receiving the image signal and the 

7 decision signal, and outputting a foreground signal and a background signal, a 

8 representation of the current pixel of the image signal being included in at least one of 

9 the foreground signal and the background signal in accordance with the decision signal. 

1 5. The system of Claim 4 wherein the separation module further comprises: 

2 a cleanup module receiving the foreground signal and the background signal, filling 

3 undefined pixels in the foreground and background signals with values computed so as 

4 to substantially prevent compression ringing artifacts and to facilitate good compression 

5 ratio, and outputting a final foreground signal and a final background signal. 

1 6. The system of Claim 5 wherein the cleanup module comprises: 

2 a dilate module extending content of defined pixels in each of the foreground and 

3 background signals to neighboring undefined pixels by filling neighboring undefined 

4 pixels with diluted foreground and background values, respectively; 

5 a block average module averaging non-zero content of the diluted foreground and 

6 background values over minimum coded unit blocks and outputting averaged block 

7 values; and 

8 a fill module filling any remaining undefined pixels with the averaged block values. 
1 7. An article of manufacture comprising: 
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2 a machine usable medium having program code embedded therein, the program code 

3 being used for separating an image signal into a set of image planes in accordance with 

4 a control signal, the program code comprising: 

5 (a) machine readable code to receive and process the control signal, and 

6 produce a selector signal; 

7 (b) machine readable code to receive the selector signal and produce a 

8 decision signal; and 

9 (c) machine readable code to receive and process the image signal and the 

10 decision signal, and outputs a foreground signal and a background signal, a 

11 representation of the current pixel of the image signal being included in at least one of 

12 the foreground signal and the background signal in accordance with the decision signal. 

1 8. The article of manufacture of Claim 7 wherein the machine readable code (c) 

2 further comprises: 

3 machine readable code to receive the foreground signal and the background 

4 signal; 

5 machine readable code to fill undefined pixels in the foreground and background 

6 signals with values computed so as to substantially prevent artifacts and to facilitate 

7 good compression ratio; and 

8 machine readable code to output a final foreground signal and a final background 

9 signal. 

1 9. The article of manufacture of Claim 8 wherein the machine readable code to fill 

2 undefined pixels comprises: 

3 machine readable code to extend content of defined pixels in each of the foreground 

4 and background signals to neighboring undefined pixels by filling neighboring undefined 

5 pixels with diluted foreground and background values, respectively; 

6 machine readable code to average non-zero content of the diluted foreground and 

7 background values over minimum coded unit blocks and to output averaged block 

8 values; and 
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* 9 machine readable code to fill any remaining undefined pixels with the averaged block 
10 values. 
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ABSTRACT 



A method and a system for separating an image signal into a set of image planes in 
accordance with a control signal. The system comprises a selector module, an edge 
processing module and a foreground/background separation module. The selector 
module receives the control signal and produces a selector signal. The edge 
processing module receives the selector signal and produces a decision signal. The 
foreground/background separation module receives the image signal and the decision 
signal, and outputs a foreground signal and a background signal, a representation of the 
current pixel of the image signal being included in at least one of the foreground signal 
and the background signal in accordance with the decision signal. 
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